Because conventional physical cluster systems cannot cope flexibly with large-scale Internet applications, a comprehensive load balancing mechanism for cloud-based virtual cluster systems is proposed. The mechanism first periodically collects the CPU usage, memory usage, number of connections, and response time of all virtual machines (VMs) and physical hosts, then calculates the weighted load of the physical hosts, and finally schedules and assigns task requests based on the calculated comprehensive load, so that it can adapt to a complex, dynamic, and variable computing environment. Experimental results show that, compared with other scheduling mechanisms such as Weighted Round Robin (WRR) and Weighted Least Connections (WLC), the proposed mechanism is delay-optimal under heavy workload; moreover, it can dynamically increase or decrease the number of VMs to balance the server load, usually within 5 seconds.
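The collect, weigh, and schedule steps described above can be illustrated with a minimal sketch. It is not the authors' implementation: the weight values, the normalization caps, and the names `Metrics`, `comprehensive_load`, and `pick_host` are all hypothetical, chosen only to show how four sampled metrics might be combined into one load score and used to select a host.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class Metrics:
    cpu: float          # CPU usage as a fraction in [0, 1]
    mem: float          # memory usage as a fraction in [0, 1]
    connections: int    # number of active connections
    response_ms: float  # mean response time in milliseconds


# Assumed values for illustration only; the abstract does not give
# the actual weights or how each metric is normalized.
WEIGHTS = {"cpu": 0.4, "mem": 0.2, "conn": 0.2, "resp": 0.2}
MAX_CONNECTIONS = 1000
MAX_RESPONSE_MS = 2000.0


def comprehensive_load(m: Metrics) -> float:
    """Combine the four metrics into one weighted score in [0, 1]."""
    conn = min(m.connections / MAX_CONNECTIONS, 1.0)
    resp = min(m.response_ms / MAX_RESPONSE_MS, 1.0)
    return (WEIGHTS["cpu"] * m.cpu
            + WEIGHTS["mem"] * m.mem
            + WEIGHTS["conn"] * conn
            + WEIGHTS["resp"] * resp)


def pick_host(samples: Dict[str, Metrics]) -> str:
    """Return the name of the host with the lowest comprehensive load."""
    return min(samples, key=lambda name: comprehensive_load(samples[name]))
```

In this sketch a scheduler would call `pick_host` with the most recent periodic sample for each physical host and forward the incoming request to the returned host. Unlike WLC, which considers only connection counts and static weights, the score here also reflects CPU, memory, and response time.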